An adaptable sentence segmentation based on Indonesian rules
Abstract
Sentence segmentation, which breaks a string of text into individual sentences, is an important phase in natural language processing (NLP). Each word in the string that ends with a punctuation mark such as a period, question mark, or exclamation point becomes a candidate location for splitting the string. Humans can easily see and split sentences, but machines cannot. These three marks also perform other functions, so the segmentation process must be able to detect whether a word marked with punctuation is a sentence boundary or not. This research proposes a system called segmentasi kalimat bahasa Indonesia (SKBI), or Indonesian Sentence Segmentation, which applies a set of rules used for Indonesian texts that can also be adapted to English. There are 34 rules built from a combination of 27 fairly complete features that contribute to this research. The experimental results on text show that SKBI achieves F1-scores of 96.89% and 97.07%. Both still need to be improved, but they are now better than in previous research.
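The idea in the abstract, that every word ending in a period, question mark, or exclamation point is only a candidate boundary that rules must confirm, can be illustrated with a minimal sketch. This is not SKBI: the 34 rules and 27 features are not listed on this page, so the two rules below, the abbreviation list, and the names `ABBREVIATIONS`, `is_boundary`, and `split_sentences` are all assumptions made for illustration.

```python
from typing import List, Optional

# Hypothetical abbreviation list (illustrative only, not taken from the paper).
ABBREVIATIONS = {"dr.", "prof.", "mr.", "mrs.", "no.", "jl.", "dll."}


def is_boundary(token: str, next_token: Optional[str]) -> bool:
    """Return True if `token` is judged to end a sentence."""
    # Only words ending in . ? or ! are candidate split locations.
    if not token.endswith((".", "?", "!")):
        return False
    # The last token of the text always closes a sentence.
    if next_token is None:
        return True
    # Assumed rule 1: a known abbreviation does not end a sentence.
    if token.lower() in ABBREVIATIONS:
        return False
    # Assumed rule 2: the following word must look like a sentence start.
    first = next_token[0]
    return first.isupper() or first.isdigit() or first in "\"'("


def split_sentences(text: str) -> List[str]:
    """Split whitespace-tokenized text at the confirmed boundaries."""
    tokens = text.split()
    sentences: List[str] = []
    current: List[str] = []
    for i, token in enumerate(tokens):
        current.append(token)
        next_token = tokens[i + 1] if i + 1 < len(tokens) else None
        if is_boundary(token, next_token):
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing words with no closing punctuation
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    sample = "Dr. Budi tiba pukul 9.30 pagi. Apakah dia terlambat? Tidak!"
    for sentence in split_sentences(sample):
        print(sentence)
```

In this sketch the period in "Dr." is rejected by the abbreviation rule, and the one in "9.30" is never a candidate because the token does not end with punctuation. Cases such as an abbreviation that really does end a sentence, or closing quotes after the punctuation mark, are not handled and would need further rules.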
Related articles
Semantic passage segmentation based on sentence topics for question answering
We propose a semantic passage segmentation method for a Question Answering (QA) system. We define a semantic passage as sentences grouped by semantic coherence, determined by the topic assigned to individual sentences. Topic assignments are done by a sentence classifier based on a statistical classification technique, Maximum Entropy (ME), combined with multiple linguistic features. We ran expe...
SATZ - An Adaptive Sentence Segmentation System
The segmentation of a text into sentences is a necessary prerequisite for many natural language processing tasks, including part-of-speech tagging and sentence alignment. This is a non-trivial task, however, since end-of-sentence punctuation marks are ambiguous. A period, for example, can denote a decimal point, an abbreviation, the end of a sentence, or even an abbreviation at the end of a sent...
An Adaptable Dialog Interface Agent Using Meta-Rules
We examine adaptable dialogue interface agents that use meta-rules to converse with information-seeking callers. For an adaptable dialogue interface agent to exhibit the intelligence required in a dialogue with a user, it needs the ability to handle unanticipated user input gracefully in a proper context, that is, it can support ad hoc requests from a user, maintain the topic of discou...
Classical Chinese Sentence Segmentation
Sentence segmentation is a fundamental issue in Classical Chinese language processing. To facilitate reading and processing of the raw Classical Chinese data, we propose a statistical method to split unstructured Classical Chinese text into smaller pieces such as sentences and clauses. The segmenter based on the conditional random field (CRF) model is tested under different tagging schemes and ...
An Improved Automatic EEG Signal Segmentation Method based on Generalized Likelihood Ratio
Electroencephalogram (EEG) signals often need to be labeled by segments of similar characteristics that are particularly meaningful to clinicians and for assessment by neurophysiologists. Within each segment, the signals are considered statistically stationary, usually with similar characteristics such as amplitude and/or frequency. In order to detect the segment boundaries of a signal, we ...
Journal
Journal title: IAES International Journal of Artificial Intelligence
Year: 2023
ISSN: 2089-4872, 2252-8938
DOI: https://doi.org/10.11591/ijai.v12.i3.pp1491-1499